An Efficient Text Classification Using Knn and Naive Bayesian
Abstract
The main objective is to propose a text classification method based on feature selection and preprocessing, thereby reducing the dimensionality of the feature vector and increasing classification accuracy. Text classification is the process of assigning a document to one or more target categories based on its contents. The proposed method applies text preprocessing to several datasets and then extracts a feature vector for each new document using various feature weighting methods, with the aim of improving classification accuracy. The classifier is then trained with the Naive Bayesian (NB) and K-nearest neighbor (KNN) algorithms; for KNN, the prediction is made according to the category distribution among the k nearest neighbors. Experimental results show that the methods compare favorably in effectiveness and efficiency with other classifiers such as SVM.
Keywords: Text classification; Feature selection; K-Nearest Neighbor; Naïve Bayesian
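The KNN step described in the abstract (weight the features, find the k nearest training documents, predict from their category distribution) can be illustrated with a minimal sketch. The abstract does not say which weighting scheme or distance measure the authors used, so this sketch assumes TF-IDF weighting and cosine similarity. The function names, the small stop-word list, and the toy training sentences are hypothetical and are not taken from the paper; the Naive Bayesian classifier is not shown.

```python
import math
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not the paper's preprocessing.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return [w for w in cleaned.split() if w not in STOP_WORDS]

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every training term."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

def tfidf_vector(tokens, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors (a dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def knn_predict(text, train_vecs, train_labels, idf, k=3):
    """Predict the majority category among the k most similar training documents."""
    query = tfidf_vector(tokenize(text), idf)
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]),
                    reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy training set, invented purely for illustration.
train = [
    ("the team won the football match", "sports"),
    ("the striker scored a goal in the match", "sports"),
    ("the player signed for a new team", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell as rates rose", "finance"),
    ("investors sold shares in the bank", "finance"),
]
token_lists = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(token_lists)
vecs = [tfidf_vector(tokens, idf) for tokens in token_lists]

if __name__ == "__main__":
    print(knn_predict("the team scored a late goal", vecs, labels, idf, k=3))
    print(knn_predict("the bank cut interest rates", vecs, labels, idf, k=3))
```

In this sketch the vote is unweighted, so each of the k neighbors counts equally; weighting votes by similarity is a common variation, but whether the paper does so is not stated in the abstract.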
Related resources
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents' contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...
An Efficient Mining Model For Enhancing Text Classification Using k-NN
Text classification is a supervised technique that uses labeled training data to learn the classification system and then automatically classifies the remaining text using the learned system. Classification plays a vital role in many information management and retrieval tasks. Classification includes different parts such as text processing, feature extraction, feature vector construction and fi...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...